After completing this lesson, you’ll be able to:
Well, Frank's plan to avoid becoming the "debugging guy" failed.
His manager just assigned him to take over a project from a colleague and passed their workspace on to him. This project is to calculate the "walkability" of each address in the city of Vancouver. Walkability measures how easy it is to access local facilities on foot. The workspace will measure the distance to the nearest park, the amount of crime in an area, and other similar metrics.
The workspace currently assesses crime, parks, and noise-control areas, but it doesn't give an overall measure of walkability.
Frank needs to build on their workspace and use his debugging skills to address any problems he encounters.
Maybe he can hold that debugging workshop afterwards...
Frank opens the starting workspace (C:\FMEData\Workspaces\UseDataIntegrationBestPractices\debugging-a-workspace.fmw) in FME Workbench (2025.0.1 or later). Then, he runs the workspace to cache the data.
First, he figures out what the workspace does:
The ExpressionEvaluator transformer creates a measure of walkability that combines the values from crime, park proximity, and noise zones.
Frank inspects the parameters of the ExpressionEvaluator transformer at the end of the workspace.
It creates a new attribute called Walkability that is:
@Value(ParkDistance) + @Value(CrimeValue) - @Value(NoiseZoneScore)
With this expression, the smaller the result, the more walkable an address.
Frank assesses whether the result of the translation is correct.
Firstly, he checks the log window for errors and warnings. There are no errors, but there are many warnings, which is not a good sign:
The number of warnings in the Translation Log may differ in your workspace. These numbers can vary based on the Logging Parameters set in FME Options.
He clicks the warnings button to filter the log so that it shows only the warnings. The warnings say:
Null, missing, or empty string operand was found in expression '@Value(ParkDistance) + @Value(CrimeValue) - @Value(NoiseZoneScore)'. Result is set to null
He inspects the output cache on the ExpressionEvaluator (clicking the link next to the warning in the log focuses on it). He finds some addresses have a Walkability value of <null>.
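The behaviour described by the warning can be mimicked outside FME. The following Python sketch is only an analogy, not how the ExpressionEvaluator is implemented: the attribute names come from the expression above, while the sample values and the helper names `is_blank` and `walkability` are made up for illustration.

```python
from typing import Optional


def is_blank(value) -> bool:
    """True for a null (None) or empty-string operand."""
    return value is None or value == ""


def walkability(feature: dict) -> Optional[float]:
    """ParkDistance + CrimeValue - NoiseZoneScore, or None if any operand is unusable."""
    names = ("ParkDistance", "CrimeValue", "NoiseZoneScore")
    # dict.get returns None for an absent key, so missing and null are treated alike here.
    operands = [feature.get(name) for name in names]
    if any(is_blank(value) for value in operands):
        return None  # mirrors "Result is set to null"
    park, crime, noise = (float(value) for value in operands)
    return park + crime - noise


# Hypothetical features: the second one has no CrimeValue attribute at all.
complete = {"ParkDistance": 120.0, "CrimeValue": 3, "NoiseZoneScore": 1}
no_crime = {"ParkDistance": 120.0, "NoiseZoneScore": 1}

print(walkability(complete))  # 122.0
print(walkability(no_crime))  # None
```

A single unusable operand is enough to make the whole result null, which is why the warning alone does not say which of the three attributes is at fault.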
He knows there is a problem. Now he has to find out where the problem is and why it occurs.
There were no errors, but the workspace's output is still incorrect. Always inspect your workspace's results to ensure you have configured it correctly.
He can tell the warning comes from the ExpressionEvaluator, but that doesn't necessarily mean that is where the problem lies.
Because he knows a null, missing, or empty string is the problem, he can inspect the ExpressionEvaluator cache to find its source. A practical way to do this is to right-click each of ParkDistance, CrimeValue, and NoiseZoneScore in the Table View window and sort the column in ascending numeric order. This sorting puts any null or missing values at the top of the table.
Frank does this, and the sorting reveals that CrimeValue has <missing> values. So, the calculation in the ExpressionEvaluator fails because the middle value is <missing>. Now to find out why some of these features have missing CrimeValue values.
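Sorting each column works well in Table View. If the same check were scripted instead, one option is to tally the unusable values per attribute. This is a minimal sketch with hypothetical sample features; the `MISSING` sentinel and the `blank_counts` helper are illustrative names, not part of FME.

```python
from collections import Counter

MISSING = object()  # sentinel standing in for FME's <missing>


def blank_counts(features, names):
    """Count missing, null, and empty values for each named attribute."""
    counts = Counter()
    for feature in features:
        for name in names:
            value = feature.get(name, MISSING)
            if value is MISSING:
                counts[(name, "missing")] += 1
            elif value is None:
                counts[(name, "null")] += 1
            elif value == "":
                counts[(name, "empty")] += 1
    return counts


# Hypothetical cache contents: two features lack CrimeValue entirely.
features = [
    {"ParkDistance": 80.0, "CrimeValue": 2, "NoiseZoneScore": 0},
    {"ParkDistance": 310.5, "NoiseZoneScore": 1},
    {"ParkDistance": 45.2, "NoiseZoneScore": 0},
]

counts = blank_counts(features, ["ParkDistance", "CrimeValue", "NoiseZoneScore"])
print(dict(counts))  # {('CrimeValue', 'missing'): 2}
```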
He inspects the FeatureJoiner caches because that's where he first gets the Crime data:
The FeatureJoiner's Joined output has no missing values, so he continues downstream through the workspace. He checks the cache for the AttributeValueMapper. This transformer sets values, so perhaps the missing values come from it.
He inspects the AttributeValueMapper cache but sees no missing values for the CrimeValue or the crime Type attribute. There are also no missing values in the Aggregator and CenterPointReplacer caches.
What about the 3,698 address features that have no matching crime; what CrimeValue do they get? He inspects the UnjoinedLeft output from the FeatureJoiner and sees that these features do not have the CrimeValue attribute. That's why the ExpressionEvaluator reports missing values. These features have no CrimeValue because they never enter the AttributeValueMapper, which is the transformer that assigns a value to CrimeValue.
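The pattern behind this bug is a join where only the matched branch passes through the step that creates an attribute. The Python sketch below illustrates that pattern under simplifying assumptions: the join key `AddressId`, the crime types, and the score mapping are all hypothetical, and the real workspace has additional transformers between the join and the NeighborFinder.

```python
def feature_joiner(left, right, key):
    """Split left features into (joined, unjoined_left) by whether key matches a right feature."""
    lookup = {feature[key]: feature for feature in right}
    joined, unjoined_left = [], []
    for feature in left:
        match = lookup.get(feature[key])
        if match is None:
            unjoined_left.append(dict(feature))
        else:
            joined.append({**feature, **match})
    return joined, unjoined_left


CRIME_VALUES = {"Theft": 2, "Mischief": 1}  # hypothetical mapping of crime Type to score


def map_crime_value(feature):
    """Stand-in for the value-mapping step: adds CrimeValue based on Type."""
    return {**feature, "CrimeValue": CRIME_VALUES.get(feature["Type"], 0)}


addresses = [{"AddressId": 1}, {"AddressId": 2}, {"AddressId": 3}]
crimes = [{"AddressId": 2, "Type": "Theft"}]

joined, unjoined_left = feature_joiner(addresses, crimes, "AddressId")
scored = [map_crime_value(feature) for feature in joined]  # only joined features are scored
merged = scored + unjoined_left                            # both branches meet downstream

print(["CrimeValue" in feature for feature in merged])  # [True, False, False]
```

No individual step reports a problem, yet the merged stream contains features without the attribute, which only surfaces later when an expression tries to use it.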
He confirms this issue by inspecting the NeighborFinder's MatchedBase cache, which contains addresses with and without crime values. He sorts CrimeValue and sees that it has missing values here:
Due to a bug, if you are using FME 2025.0, you may not see the City, Province, and other Base feature attributes. You can check Attribute Accumulation > Merge Attributes at the bottom of the NeighborFinder parameters to ensure the transformer exposes them.
Those features that do not have a CrimeValue attribute are causing the problem, so he should give them one. To do so, he adds an AttributeCreator transformer to the workspace between the FeatureJoiner's UnjoinedLeft output port and the NeighborFinder's Base input port:
He opens its parameters and creates an attribute called CrimeValue with a value of zero (0).
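In script form, the fix amounts to giving the unmatched branch a default before the branches merge. A minimal sketch, assuming hypothetical features and a made-up `create_attribute` helper:

```python
def create_attribute(features, name, value):
    """Return copies of the features with the attribute name set to value."""
    return [{**feature, name: value} for feature in features]


# Hypothetical unmatched addresses and already-scored matched addresses.
unjoined_left = [{"AddressId": 1}, {"AddressId": 3}]
scored = [{"AddressId": 2, "Type": "Theft", "CrimeValue": 2}]

with_default = create_attribute(unjoined_left, "CrimeValue", 0)
merged = scored + with_default

print(with_default)  # [{'AddressId': 1, 'CrimeValue': 0}, {'AddressId': 3, 'CrimeValue': 0}]
print(all("CrimeValue" in feature for feature in merged))  # True
```

Applying the default only to the unmatched branch matters: setting CrimeValue to 0 after the branches merge would overwrite the scores of addresses that do have crimes.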
He runs the workspace again. Because the upstream caches are still valid, only the section from the AttributeCreator to the ExpressionEvaluator reruns. The log now shows fewer warnings, and the Walkability attribute contains no <null> values. He notes the rounded maximum value of Walkability: 956.
The city has decided that parks are not a great candidate for walkability scores because there is usually a park nearby. They decided instead to include the walking distance to the nearest swimming pool.
With just a few minor updates, Frank can reuse the same workflow for swimming pools that he used for parks.
First, he adds a new reader with the following parameters:
Reader Format | OpenStreetMap (OSM) XML
Reader Dataset | https://s3.amazonaws.com/FMEData/FMEData/Data/OpenStreetMap/leisure.osm or C:\FMEData\Data\OpenStreetMap\leisure.osm
When prompted, he selects only the leisure feature type:
Next, he moves the new leisure reader near the Parks reader and connects it to the NeighborFinder's Candidate input port. He then right-clicks the Parks reader and selects Disable.
Frank inspects the leisure data, noticing various leisure facility types, with the type recorded in the leisure attribute.
So, he adds a Tester transformer between the leisure reader and the NeighborFinder. He sets the parameters to test for leisure = swimming_pool:
Now he updates the AttributeRenamer to use PoolDistance instead of ParkDistance. Renaming this attribute causes the ExpressionEvaluator to turn red.
To fix the ExpressionEvaluator, he opens the parameters and changes @Value(ParkDistance) to @Value(PoolDistance) to take account of the new PoolDistance attribute:
@Value(PoolDistance) + @Value(CrimeValue) - @Value(NoiseZoneScore)
He makes the same change in the AttributeKeeper transformer, replacing ParkDistance with PoolDistance.
He re-runs the workspace. He checks the log for warnings and errors, then inspects the ExpressionEvaluator cache.
He notices that the walkability scores are now exceedingly large due to the PoolDistance values. The new max value is 5,477,800. Something is wrong, but what?
PoolDistance is the source of the problem. There is no related log message to give a clue, and the Feature Count numbers look correct.
Frank inspects the data. He clicks on the leisure reader, and while holding the Shift key, clicks on the NeighborFinder. This step will open all the selected caches in Visual Preview.
If you have Toggle Automatic Inspect on Selection disabled, you'll have to right-click on either object and select Inspect Cached Features after selecting them both, or Ctrl-click the cache itself instead of the transformer.
He right-clicks in the Graphics view, goes to Background Map, and selects Background map off. Visual Preview shows two specks of data a long distance apart. This result is typical of a mismatch of coordinate systems.
We turn the background map off because otherwise, Visual Preview automatically reprojects data with mismatched coordinate systems. Turning the background map off lets us see that these are not using the same coordinate system.
He clicks on some features and selects the Feature Information button. This window shows that the primary data has a coordinate system of UTM83-10, while the leisure data from OSM has a coordinate system of LL84.
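This disparity is why the distance to the "nearest" pool is so large for every address: the NeighborFinder compares UTM coordinates in meters with longitude and latitude values in degrees as if they were in the same units.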
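A quick calculation shows why the distances land in the millions. The sketch below uses made-up coordinates for a Vancouver address in UTM83-10 and a pool in LL84, and measures a plain planar distance between them as if both were in the same units. It is an illustration of the mismatch, not the NeighborFinder's actual algorithm.

```python
import math

# Hypothetical coordinates for illustration only.
address_utm = (491_000.0, 5_459_000.0)  # easting, northing in meters (UTM83-10)
pool_ll = (-123.1, 49.28)               # longitude, latitude in degrees (LL84)


def planar_distance(a, b):
    """Straight-line distance that naively assumes both points share one coordinate system."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


bogus = planar_distance(address_utm, pool_ll)
print(f"{bogus:,.0f}")  # a value in the region of 5.5 million
```

The result is dominated by the UTM northing of about 5.46 million meters, which is consistent in magnitude with the suspicious maximum of 5,477,800 seen in the cache.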
The obvious solution is to reproject the pools to the correct coordinate system. So, he adds a Reprojector transformer to reproject the leisure data before it gets to the NeighborFinder:
He inspects its parameters and sets it to reproject from LL84 to UTM83-10.
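To give a feel for what the reprojection does to each coordinate, here is a sketch of the standard forward transverse Mercator series for UTM in the northern hemisphere. It is an approximation written for illustration and is not how the Reprojector is implemented: it uses GRS80 ellipsoid constants, ignores the small datum difference between LL84 and NAD83, and the function name `ll_to_utm` and the sample coordinates are hypothetical.

```python
import math

A = 6378137.0          # GRS80 semi-major axis in meters
F = 1 / 298.257222101  # GRS80 flattening
K0 = 0.9996            # UTM scale factor on the central meridian
E2 = F * (2 - F)       # first eccentricity squared
EP2 = E2 / (1 - E2)    # second eccentricity squared


def ll_to_utm(lon_deg, lat_deg, zone=10):
    """Approximate longitude/latitude (degrees) to UTM easting/northing (meters)."""
    lon0 = math.radians(zone * 6 - 183)  # central meridian: -123 degrees for zone 10
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    n = A / math.sqrt(1 - E2 * math.sin(lat) ** 2)
    t = math.tan(lat) ** 2
    c = EP2 * math.cos(lat) ** 2
    a = (lon - lon0) * math.cos(lat)
    # Meridian arc length from the equator to the given latitude.
    m = A * (
        (1 - E2 / 4 - 3 * E2**2 / 64 - 5 * E2**3 / 256) * lat
        - (3 * E2 / 8 + 3 * E2**2 / 32 + 45 * E2**3 / 1024) * math.sin(2 * lat)
        + (15 * E2**2 / 256 + 45 * E2**3 / 1024) * math.sin(4 * lat)
        - (35 * E2**3 / 3072) * math.sin(6 * lat)
    )
    easting = 500_000 + K0 * n * (
        a
        + (1 - t + c) * a**3 / 6
        + (5 - 18 * t + t**2 + 72 * c - 58 * EP2) * a**5 / 120
    )
    northing = K0 * (
        m
        + n * math.tan(lat) * (
            a**2 / 2
            + (5 - t + 9 * c + 4 * c**2) * a**4 / 24
            + (61 - 58 * t + t**2 + 600 * c - 330 * EP2) * a**6 / 720
        )
    )
    return easting, northing


# Hypothetical points: an address already in UTM83-10 and a pool in LL84.
address_utm = (491_000.0, 5_459_000.0)
pool_utm = ll_to_utm(-123.1, 49.28)

distance = math.hypot(address_utm[0] - pool_utm[0], address_utm[1] - pool_utm[1])
print(f"pool at {pool_utm[0]:,.0f} E, {pool_utm[1]:,.0f} N; distance {distance:,.0f} m")
```

Once both points are expressed in meters in the same zone, the distance between them drops from millions to a plausible walking distance.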
He reruns the appropriate parts of the workspace. He checks the log window and inspects the ExpressionEvaluator cache.
Each address now has a walkability score accounting for pools instead of parks, with a lower number being better and a higher number worse. The new (correct, rounded) maximum is 4,308.
Another day, another workspace debugged. Frank decides he had better plan that workshop.